Tesla K80
Apparently, installing four of these is enough to run Deepseek-r1 70B
@skoroneos: Deepseek-r1 70B running on 4 K80 (96GB NVRAM) GPU's and Ollama. It takes 70% of memory https://pbs.twimg.com/media/GiTBjWkWQAAtDVK.png
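As a rough sanity check on the numbers in the tweet, the memory arithmetic can be sketched as follows. It assumes each K80 board carries 24 GB split across two GPUs of 12 GB each, so four boards would show up as eight devices. The variable names are illustrative only.

```shell
#!/bin/sh
# Assumption: one Tesla K80 board = 2 GPUs x 12 GB = 24 GB.
boards=4
gb_per_board=24
total_gb=$((boards * gb_per_board))

# 70% usage as reported in the tweet (integer arithmetic, rounded down).
used_gb=$((total_gb * 70 / 100))

# Each board exposes two GPU devices to the system.
gpu_count=$((boards * 2))

echo "total=${total_gb}GB used~${used_gb}GB devices=${gpu_count}"
```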
An example of running CUDA 11.4 on Ubuntu 22.04
Installing CUDA
Installing CUDA mostly worked by following these steps
CUDA Driver
code:sh
code:sh
sudo dpkg -i nvidia-driver-local-repo-ubuntu2004-470.256.02_1.0-1_amd64.deb
code:sh
sudo cp /var/nvidia-driver-local-repo-ubuntu2004-470.256.02/nvidia-driver-local-DF02B125-keyring.gpg /usr/share/keyrings/
code:sh
sudo apt update && sudo apt upgrade
code:sh
sudo reboot
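After the reboot, it may be worth confirming that the driver actually loaded before moving on to the toolkit. The following is a minimal sketch; `check_driver` is a hypothetical helper name, and the query fields are standard `nvidia-smi` options.

```shell
#!/bin/sh
# check_driver is a hypothetical helper, not part of the NVIDIA packages.
check_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: the driver is probably not installed"
        return 1
    fi
    # List each GPU with its driver version and total memory as CSV.
    nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv
}

check_driver || echo "driver check failed"
```

With the 470 driver installed, the driver_version column should report a 470.x version for each GPU.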
CUDA Toolkit
code:sh
code:sh
sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
code:sh
code:sh
sudo dpkg -i cuda-repo-ubuntu2004-11-4-local_11.4.4-470.82.01-1_amd64.deb
code:sh
sudo apt-key add /var/cuda-repo-ubuntu2004-11-4-local/7fa2af80.pub
code:sh
sudo apt-get update
sudo apt-get -y install cuda
code:sh
sudo reboot
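The toolkit's `nvcc` is usually not on `PATH` right after installation, so a quick check like the one below can help. This is a sketch that assumes the toolkit landed in `/usr/local/cuda-11.4`, the same prefix used in the build step; adjust `CUDA_HOME` if your install differs.

```shell
#!/bin/sh
# Assumption: the toolkit was installed under /usr/local/cuda-11.4.
CUDA_HOME=/usr/local/cuda-11.4
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

if command -v nvcc >/dev/null 2>&1; then
    # Prints the compiler release, which should mention 11.4.
    nvcc --version
else
    echo "nvcc not found under $CUDA_HOME/bin"
fi
```

To make this persistent, the two `export` lines could be added to a shell profile such as `~/.bashrc`.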
The steps above also pull in a GUI, so configure the system not to start it
code:sh
sudo systemctl set-default multi-user.target
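To confirm the change took effect, the current default target can be read back. This is a small sketch; the `current` variable is illustrative, and it falls back to `unknown` on systems without systemd.

```shell
#!/bin/sh
# Read the default boot target; expect multi-user.target after the change.
if command -v systemctl >/dev/null 2>&1; then
    current=$(systemctl get-default 2>/dev/null || echo unknown)
else
    current=unknown
fi
echo "default target: $current"

# To bring the GUI back later:
#   sudo systemctl set-default graphical.target
```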
Install CUDA (the 470 driver series)
Build
code:sh
export CMAKE_PREFIX_PATH=/usr/local/cuda-11.4
cmake -B build
cmake --build build
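A plain `cmake -B build` may produce a CPU-only binary, so the CUDA backend likely needs to be enabled explicitly. The sketch below assumes `GGML_CUDA` is the option name in the llama.cpp tree being built (older trees used other names) and targets compute capability 3.7, which is what the K80 reports. Whether a given llama.cpp revision still compiles for that architecture with CUDA 11.4 is not something this note confirms.

```shell
#!/bin/sh
# Assumptions: GGML_CUDA enables the CUDA backend in this llama.cpp tree,
# and the toolkit lives under /usr/local/cuda-11.4.
CUDA_HOME=/usr/local/cuda-11.4
# 37 = compute capability 3.7 (Tesla K80).
CMAKE_ARGS="-DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=37 -DCMAKE_CUDA_COMPILER=$CUDA_HOME/bin/nvcc"

if [ -f CMakeLists.txt ] && command -v cmake >/dev/null 2>&1; then
    # Run from the root of the llama.cpp checkout.
    cmake -B build $CMAKE_ARGS && cmake --build build
else
    echo "would run: cmake -B build $CMAKE_ARGS"
fi
```

The guard only exists so the script is harmless outside a source checkout; inside the llama.cpp directory it runs the configure and build steps directly.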
Run llama.cpp